Postpartum depression affects a huge number of women and has detrimental consequences. Knowing the factors associated with postpartum depression during pregnancy can help its prevention. Although there is evidence surrounding behavioral or psychological predictors of postpartum depression, there is a lack of evidence of biological forecasters. The aim of this study was to analyze the sociodemographic, obstetric, and psychological variables along with hair cortisol levels during the first, second, and third trimesters of pregnancy that could predict postpartum depression symptoms. A sample of 44 pregnant women was assessed during 3 trimesters of pregnancy and the postpartum period using psychological questionnaires and hair cortisol levels. Participants were divided into 2 groups: a group with postpartum depression symptoms and a group with no postpartum depression symptoms. Results showed significant positive differences between groups in the first trimester regarding the Somatization subscale of the SCL-90-R (p < .05). In the second trimester, significant differences were found in the Somatization, Depression, Anxiety, and GSI subscales (p < .05). In the third trimester significant differences between both groups were found regarding pregnancy-specific stress. We found significant positive differences between groups regarding hair cortisol levels in the first and the third trimester. Hair cortisol levels could predict 21.7% of the variance of postpartum depression symptoms. In conclusion, our study provided evidence that psychopathological symptoms, pregnancy-specific stress, and hair cortisol levels can predict postpartum depression symptoms at different time-points during pregnancy. These findings can be applied in future studies and improve maternal care in clinical settings. [1]
The main data quality issue we encountered was that the variable names were in Spanish. In addition, the metadata for the categorical mappings (i.e., the original researchers' codebook) was only accessible through SPSS (.sav file). Because of this, interpreting their data was difficult. We also considered the data to be highly curated, poorly representative of the population, at risk of error from manual data entry, and potentially influenced by temporal components that were not accounted for.
Part 2 Rmd Part 2 html code book
The codebook came together fairly well once we got access to the .sav metadata through SPSS (thanks to wei-chun). The main point to note for this part is that the dataMaid package produced a far cleaner codebook and was easier to use than the codebook library.
Our EDA consisted of several main parts. First, we created density plots of cortisol levels over the trimesters and facet-wrapped them by postpartum depression outcome. This let us see the large second-trimester spike in cortisol among women afflicted with postpartum depression, which a previous paper that measured cortisol in urine had also described. We then investigated the relationship between different jobs and cortisol levels over the trimesters. Some jobs did seem to have higher overall cortisol levels, something we think may have affected the outcome.
Then we investigated the relationships between postpartum depression outcome and employment status, fetal sex, and first or second pregnancy. Each was investigated individually. Being unemployed, being on a second pregnancy, and carrying a male fetus each appeared to be associated with a higher likelihood of postpartum depression. However, we believe that all of these differences would eventually even out if the sample size were increased. The population used for this study was too small to show convincing causality for any of these factors.
Next, we performed a linear regression of the depression score from the Edinburgh Postnatal Depression Scale on the cortisol levels. This was interesting because it showed that the scores were often mixed with respect to outcome: some women scored highly on the test but did not experience postpartum depression, and some women who scored low on the test were afflicted. Also of interest, the correlation is negative for the first two trimesters but becomes positive in the third trimester. We believe this is again due to the small sample size and several outliers.
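As a minimal sketch of this kind of regression, the block below fits a simple linear model on simulated values. The column names `cortisol` and `epds` are hypothetical and the numbers are random, so the output says nothing about the study data; it only shows how the slope and the explained variance can be read off.

```r
# Simulated values only: `cortisol` and `epds` are hypothetical column names.
set.seed(42)
toy <- data.frame(cortisol = rnorm(44, mean = 5.5, sd = 0.5))
toy$epds <- 8 + 2 * (toy$cortisol - 5.5) + rnorm(44, sd = 4)

# Regress the depression score on the (log) cortisol level.
fit <- lm(epds ~ cortisol, data = toy)

coef(fit)               # intercept and slope; the sign of the slope is the sign of the correlation
summary(fit)$r.squared  # share of the score variance explained by cortisol
```

Fitting the same model once per trimester would show the sign change described above.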
Finally, we compiled a correlation matrix of the values used in the study. We observed strong correlations between several of the psychological variables and the biological ones. This led us to question the use of these variables as lone predictors of postpartum depression; we see them instead as indicators within a larger mosaic of variables that together contribute to the disease.
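A correlation matrix of this kind can be sketched with base R's `cor()`. The data below are simulated, the column names are borrowed from the dataset only for illustration, and the 0.6 threshold is an arbitrary assumption.

```r
# Simulated stand-in data; one pair is constructed to be correlated.
set.seed(7)
toy <- data.frame(
  LNCORTISOL1    = rnorm(44),
  PDQ1           = rnorm(44),
  SOMATIZATIONS1 = rnorm(44)
)
toy$DEPRESSION1 <- 0.8 * toy$SOMATIZATIONS1 + rnorm(44, sd = 0.5)

# Pearson correlation matrix of all numeric columns.
cm <- cor(toy, method = "pearson")
round(cm, 2)

# List the variable pairs whose absolute correlation exceeds the threshold,
# looking only at the upper triangle so each pair appears once.
high <- which(abs(cm) > 0.6 & upper.tri(cm), arr.ind = TRUE)
data.frame(var1 = rownames(cm)[high[, "row"]],
           var2 = colnames(cm)[high[, "col"]])
```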
Load Dataset
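library(dplyr)    # filter, group_by, summarise, count, mutate
library(tidyr)    # gather
library(ggplot2)  # interaction plots (mean_cl_boot also needs Hmisc installed)
load('../../data/tidy_data.Rdata')
data_eng <- read.csv('../../data/data_eng.csv')
pp_sad <- df_tidy # give the tidy data a more meaningful name
data_eng <- na.omit(data_eng)
data_eng$depreposparto <- as.factor(data_eng$depreposparto)
“Differences in sociodemographic, obstetrics variables and depression symptomatology between women with postpartum depression and without postpartum depression.”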
In this table, Student's t-tests were used for quantitative variables and chi-square tests for categorical variables. Sport indicates whether participants practiced any regular physical activity during pregnancy. The tests indicate no differences between groups in the main sociodemographic data, obstetric variables, and hair characteristics, with two exceptions: significant differences were found for previous miscarriages (\(\chi^2\) = 4.71, p < .05) and the sex of the fetus (\(\chi^2\) = 6.03, p < .05).
To replicate this table, we used the “dplyr” package to filter, summarise, mutate, and count the dataset, obtaining the mean, standard deviation, number, and percentage of each variable in the two groups (depression/no depression). “t.test” and “chisq.test” were used to test the differences between the no-depression and depression groups.
Observations: Three variables are missing: we could not find marital status, labor and delivery, or pain relief in labor in the dataset. We also found several errors: the chi-square values for “Nationality”, “Wanted pregnancy”, “Antenatal depression”, and “Pregnancy method” are not correct, and the number for “Postnatal depression (EPDS)” is wrong. The p-values of all tests are the same as or close to those in the original table. The errors and differences are labeled in red.
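The per-group counts and percentages are computed one variable at a time in this report. A compact alternative, sketched here in base R, is a small helper built on `table()` and `prop.table()`. The helper name `count_pct` is hypothetical, and the toy data frame is rebuilt from the fetus_sex counts printed in this report, so it reproduces only those margins and is not the study data.

```r
# Rebuild the group / fetus_sex columns from the counts printed in this report
# (group 1: 21 and 7; group 2: 6 and 10). Illustration only.
toy <- data.frame(
  postpartum_depression = rep(c(1, 1, 2, 2), times = c(21, 7, 6, 10)),
  fetus_sex             = rep(c(0, 1, 0, 1), times = c(21, 7, 6, 10))
)

# Hypothetical helper: counts and row percentages of `var` within each group.
count_pct <- function(data, group, var) {
  tab <- table(data[[group]], data[[var]], dnn = c(group, var))
  list(n = tab, pct = round(100 * prop.table(tab, margin = 1), 1))
}

res <- count_pct(toy, "postpartum_depression", "fetus_sex")
res$n    # counts per group
res$pct  # row percentages per group
```

Calling the helper inside `lapply()` over a vector of variable names would cover the remaining categorical variables in one pass.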
#insert images into R Markdown
knitr::include_graphics("table1.png")
#Get mean/sd of age and perform two sample t-test of age
pp_sad %>%
group_by(postpartum_depression) %>%
summarise(sd_age = sd(age), mean_age = mean(age))
## # A tibble: 2 x 3
## postpartum_depression sd_age mean_age
## <dbl> <dbl> <dbl>
## 1 1 4.06 32.1
## 2 2 3.62 32.9
t.test(pp_sad$age ~ pp_sad$postpartum_depression, var.equal = TRUE)
##
## Two Sample t-test
##
## data: pp_sad$age by pp_sad$postpartum_depression
## t = -0.6779, df = 42, p-value = 0.5016
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -3.302304 1.641589
## sample estimates:
## mean in group 1 mean in group 2
## 32.10714 32.93750
#Get number/percentage of categorical variables.
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(nationality) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## nationality n freq
## <dbl> <int> <dbl>
## 1 0 4 14.3
## 2 1 24 85.7
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(nationality) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## nationality n freq
## <dbl> <int> <dbl>
## 1 0 4 25
## 2 1 12 75
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(employed) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## employed n freq
## <dbl> <int> <dbl>
## 1 0 5 17.9
## 2 1 23 82.1
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(employed) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## employed n freq
## <dbl> <int> <dbl>
## 1 0 5 31.2
## 2 1 11 68.8
pp_sad %>%
filter(postpartum_depression == 1) %>%
count( education_level) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## education_level n freq
## <dbl> <int> <dbl>
## 1 2 4 14.3
## 2 3 24 85.7
pp_sad %>%
filter(postpartum_depression == 2) %>%
count( education_level) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 3 x 3
## education_level n freq
## <dbl> <int> <dbl>
## 1 1 1 6.25
## 2 2 5 31.2
## 3 3 10 62.5
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(sport) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## sport n freq
## <dbl> <int> <dbl>
## 1 0 9 32.1
## 2 1 19 67.9
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(sport) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## sport n freq
## <dbl> <int> <dbl>
## 1 0 9 56.2
## 2 1 7 43.8
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(pet) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## pet n freq
## <dbl> <int> <dbl>
## 1 0 20 71.4
## 2 1 8 28.6
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(pet) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## pet n freq
## <dbl> <int> <dbl>
## 1 0 8 50
## 2 1 8 50
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(dyed_hair) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## dyed_hair n freq
## <dbl> <int> <dbl>
## 1 0 15 53.6
## 2 1 13 46.4
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(dyed_hair) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## dyed_hair n freq
## <dbl> <int> <dbl>
## 1 0 10 62.5
## 2 1 6 37.5
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(first_pregnancy) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## first_pregnancy n freq
## <dbl> <int> <dbl>
## 1 0 8 28.6
## 2 1 20 71.4
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(first_pregnancy) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## first_pregnancy n freq
## <dbl> <int> <dbl>
## 1 0 8 50
## 2 1 8 50
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(wanted_pregnancy) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## wanted_pregnancy n freq
## <dbl> <int> <dbl>
## 1 0 4 14.3
## 2 1 24 85.7
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(wanted_pregnancy) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## wanted_pregnancy n freq
## <dbl> <int> <dbl>
## 1 0 3 18.8
## 2 1 13 81.2
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(pregnancy_method) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## pregnancy_method n freq
## <dbl> <int> <dbl>
## 1 0 6 21.4
## 2 1 22 78.6
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(pregnancy_method) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## pregnancy_method n freq
## <dbl> <int> <dbl>
## 1 0 3 18.8
## 2 1 13 81.2
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(previous_miscarriage) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## previous_miscarriage n freq
## <dbl> <int> <dbl>
## 1 0 24 85.7
## 2 1 4 14.3
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(previous_miscarriage) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## previous_miscarriage n freq
## <dbl> <int> <dbl>
## 1 0 9 56.2
## 2 1 7 43.8
pp_sad %>%
filter(postpartum_depression == 1) %>%
count(fetus_sex) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## fetus_sex n freq
## <int> <int> <dbl>
## 1 0 21 75
## 2 1 7 25
pp_sad %>%
filter(postpartum_depression == 2) %>%
count(fetus_sex) %>%
mutate(freq = n / sum(n)*100)
## # A tibble: 2 x 3
## fetus_sex n freq
## <int> <int> <dbl>
## 1 0 6 37.5
## 2 1 10 62.5
#Get number of Antenatal depression
pp_sad %>%
filter(depression_tri1 > 70) %>%
group_by(postpartum_depression) %>%
count()
## # A tibble: 2 x 2
## # Groups: postpartum_depression [2]
## postpartum_depression n
## <dbl> <int>
## 1 1 4
## 2 2 1
pp_sad %>%
filter(depression_tri2 > 70) %>%
group_by(postpartum_depression) %>%
count()
## # A tibble: 2 x 2
## # Groups: postpartum_depression [2]
## postpartum_depression n
## <dbl> <int>
## 1 1 2
## 2 2 4
pp_sad %>%
filter(depression_tri3 > 70) %>%
group_by(postpartum_depression) %>%
count()
## # A tibble: 2 x 2
## # Groups: postpartum_depression [2]
## postpartum_depression n
## <dbl> <int>
## 1 1 5
## 2 2 5
#Create a Antenatal depression dataset for chi-square test
tidy_antDep <- pp_sad %>%
select("depression_tri1", "depression_tri2", "depression_tri3", "postpartum_depression") %>%
gather(trimester, depression, depression_tri1:depression_tri3, -postpartum_depression) %>%
filter(depression > 70)
## Warning: attributes are not identical across measure variables;
## they will be dropped
#Chi-square test to categorical variables
chisq.test(pp_sad$postpartum_depression, pp_sad$nationality, correct=FALSE)
## Warning in chisq.test(pp_sad$postpartum_depression, pp_sad$nationality, :
## Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$nationality
## X-squared = 0.78571, df = 1, p-value = 0.3754
chisq.test(pp_sad$postpartum_depression, pp_sad$employed, correct=FALSE)
## Warning in chisq.test(pp_sad$postpartum_depression, pp_sad$employed,
## correct = FALSE): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$employed
## X-squared = 1.0399, df = 1, p-value = 0.3078
chisq.test(pp_sad$postpartum_depression, pp_sad$education_level, correct=FALSE)
## Warning in chisq.test(pp_sad$postpartum_depression,
## pp_sad$education_level, : Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$education_level
## X-squared = 3.8926, df = 2, p-value = 0.1428
chisq.test(pp_sad$postpartum_depression, pp_sad$sport, correct=FALSE)
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$sport
## X-squared = 2.4478, df = 1, p-value = 0.1177
chisq.test(pp_sad$postpartum_depression, pp_sad$pet, correct=FALSE)
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$pet
## X-squared = 2.0204, df = 1, p-value = 0.1552
chisq.test(pp_sad$postpartum_depression, pp_sad$dyed_hair, correct=FALSE)
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$dyed_hair
## X-squared = 0.33083, df = 1, p-value = 0.5652
chisq.test(pp_sad$postpartum_depression, pp_sad$first_pregnancy, correct=FALSE)
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$first_pregnancy
## X-squared = 2.0204, df = 1, p-value = 0.1552
chisq.test(pp_sad$postpartum_depression, pp_sad$wanted_pregnancy, correct=FALSE)
## Warning in chisq.test(pp_sad$postpartum_depression,
## pp_sad$wanted_pregnancy, : Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$wanted_pregnancy
## X-squared = 0.15168, df = 1, p-value = 0.6969
chisq.test(pp_sad$postpartum_depression, pp_sad$pregnancy_method, correct=FALSE)
## Warning in chisq.test(pp_sad$postpartum_depression,
## pp_sad$pregnancy_method, : Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$pregnancy_method
## X-squared = 0.044898, df = 1, p-value = 0.8322
chisq.test(pp_sad$postpartum_depression, pp_sad$previous_miscarriage, correct=FALSE)
## Warning in chisq.test(pp_sad$postpartum_depression,
## pp_sad$previous_miscarriage, : Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$previous_miscarriage
## X-squared = 4.7143, df = 1, p-value = 0.02991
chisq.test(pp_sad$postpartum_depression, pp_sad$fetus_sex, correct=FALSE)
##
## Pearson's Chi-squared test
##
## data: pp_sad$postpartum_depression and pp_sad$fetus_sex
## X-squared = 6.0392, df = 1, p-value = 0.01399
chisq.test(tidy_antDep$postpartum_depression, tidy_antDep$trimester, correct=FALSE)
## Warning in chisq.test(tidy_antDep$postpartum_depression,
## tidy_antDep$trimester, : Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: tidy_antDep$postpartum_depression and tidy_antDep$trimester
## X-squared = 2.4245, df = 2, p-value = 0.2975
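Several of the calls above warn that the chi-squared approximation may be incorrect, which R reports when an expected cell count falls below 5. As a hedged sketch, and not part of the original analysis, the block below rebuilds the nationality table from the counts printed earlier, reproduces the Pearson statistic, and runs Fisher's exact test as one possible cross-check for small tables.

```r
# Counts copied from the nationality output above
# (rows: postpartum_depression 1 and 2; columns: nationality 0 and 1).
nat <- matrix(c(4, 24,
                4, 12),
              nrow = 2, byrow = TRUE,
              dimnames = list(postpartum_depression = c("1", "2"),
                              nationality = c("0", "1")))

# Pearson test without continuity correction, as in the calls above.
# suppressWarnings() hides the same approximation warning.
pearson <- suppressWarnings(chisq.test(nat, correct = FALSE))
pearson$statistic
pearson$expected   # any cell below 5 is what triggers the warning

# Fisher's exact test does not rely on the large-sample approximation.
exact <- fisher.test(nat)
exact$p.value
```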
#Check the number of epds >=10
pp_sad %>%
filter(epds >=10) %>%
count()
## # A tibble: 1 x 1
## n
## <int>
## 1 16
“Mean differences on stress and psychopathological symptoms with interaction effects between groups trimesters.”
This table shows the mean differences in stress and psychopathological scores between the postpartum depression and no postpartum depression groups across the three trimesters. Abbreviations: PDQ = Prenatal Distress Questionnaire; SCL-90-R = Symptom CheckList 90 Revised; SOMS = Somatisation; DEP = Depression; ANX = Anxiety; GSI = Global Severity Index. For pregnancy-specific stress, a significant difference between the groups was found in the third trimester (t = -2.67, p = .01). For the SCL-90-R, significant differences were found in the first trimester for the Somatization subscale (t = -2.70, p = .01) and in the second trimester for the Somatization subscale (t = -2.34, p = .02), the Depression subscale (t = -2.67, p = .01), the Anxiety subscale (t = -3.22, p = .002), and the GSI global index (t = -2.38, p = .02). No significant differences between the groups were found on the SCL-90-R subscales in the third trimester.
To replicate this table, we used the “dplyr” package to get the mean and standard deviation. A two-sample t-test (“t.test”) was used for the comparisons.
Observations:
We found several errors in the first trimester, including the t-statistic and p-value for distress, somatization, depression, and anxiety. The mean and standard deviation of somatization in the third trimester are also incorrect. Notably, our replication finds no significant difference for the Somatization subscale in the first trimester (t = -1.79, p = .08), contrary to the original table. One possible explanation is that values were entered in the wrong cells: for example, the t-statistic and p-value for somatization in the first trimester are the same as those for distress in the first trimester in the original table. The errors and differences are labeled in red.
#insert images into R Markdown
knitr::include_graphics("table2.png")
#Get mean and sd of all variables
data_eng %>%
group_by(depreposparto) %>%
summarise(mean_PDQ1 = mean(PDQ1), sd_PDQ1=sd(PDQ1), mean_SOMS1 = mean(SOMATIZATIONS1), sd_SOMS1 = sd(SOMATIZATIONS1), mean_DEP1 = mean(DEPRESSION1), sd_DEP1 = sd(DEPRESSION1), mean_anx1 = mean(ANXIETY1), sd_anx1 = sd(ANXIETY1), mean_GSI1 = mean(IGS1), sd_GSI1 = sd(IGS1))
## # A tibble: 2 x 11
## depreposparto mean_PDQ1 sd_PDQ1 mean_SOMS1 sd_SOMS1 mean_DEP1 sd_DEP1
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 14.0 6.45 57.1 26.3 43.0 23.1
## 2 2 15.8 5.65 70.7 19.3 44.3 18.8
## # ... with 4 more variables: mean_anx1 <dbl>, sd_anx1 <dbl>,
## # mean_GSI1 <dbl>, sd_GSI1 <dbl>
data_eng %>%
group_by(depreposparto) %>%
summarise(mean_PDQ2 = mean(PDQ2), sd_PDQ2=sd(PDQ2), mean_SOMS2 = mean(SOMATIZATIONS2), sd_SOMS2 = sd(SOMATIZATIONS2), mean_DEP2 = mean(DEPRESSION2), sd_DEP2 = sd(DEPRESSION2), mean_anx2 = mean(ANXIETY2), sd_anx2 = sd(ANXIETY2), mean_GSI2 = mean(IGS2), sd_GSI2 = sd(IGS2))
## # A tibble: 2 x 11
## depreposparto mean_PDQ2 sd_PDQ2 mean_SOMS2 sd_SOMS2 mean_DEP2 sd_DEP2
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 12.6 5.54 41.9 17.8 30.0 18.7
## 2 2 14.2 4.40 56.0 21.4 47.3 23.6
## # ... with 4 more variables: mean_anx2 <dbl>, sd_anx2 <dbl>,
## # mean_GSI2 <dbl>, sd_GSI2 <dbl>
data_eng %>%
group_by(depreposparto) %>%
summarise(mean_PDQ3 = mean(PDQ3), sd_PDQ3=sd(PDQ3), mean_SOMS3 = mean(SOMATIZATIONS3), sd_SOMS3 = sd(SOMATIZATIONS3), mean_DEP3 = mean(DEPRESSION3), sd_DEP3 = sd(DEPRESSION3), mean_anx3 = mean(ANXIETY3), sd_anx3 = sd(ANXIETY3), mean_GSI3 = mean(IGS3), sd_GSI3 = sd(IGS3))
## # A tibble: 2 x 11
## depreposparto mean_PDQ3 sd_PDQ3 mean_SOMS3 sd_SOMS3 mean_DEP3 sd_DEP3
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 11.2 4.37 50.2 23.0 44.2 27.2
## 2 2 14.8 3.79 65.0 24.7 59.6 22.8
## # ... with 4 more variables: mean_anx3 <dbl>, sd_anx3 <dbl>,
## # mean_GSI3 <dbl>, sd_GSI3 <dbl>
#T-test
t.test(data_eng$PDQ1 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$PDQ1 by data_eng$depreposparto
## t = -0.95508, df = 42, p-value = 0.345
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -5.753489 2.057060
## sample estimates:
## mean in group 1 mean in group 2
## 13.96429 15.81250
t.test(data_eng$SOMATIZATIONS1 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$SOMATIZATIONS1 by data_eng$depreposparto
## t = -1.7938, df = 42, p-value = 0.08005
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -28.718997 1.689888
## sample estimates:
## mean in group 1 mean in group 2
## 57.14286 70.65741
t.test(data_eng$DEPRESSION1 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$DEPRESSION1 by data_eng$depreposparto
## t = -0.18255, df = 42, p-value = 0.856
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -14.95724 12.47568
## sample estimates:
## mean in group 1 mean in group 2
## 43.03948 44.28026
t.test(data_eng$ANXIETY1 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$ANXIETY1 by data_eng$depreposparto
## t = -0.68522, df = 42, p-value = 0.497
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -21.20154 10.45341
## sample estimates:
## mean in group 1 mean in group 2
## 54.63457 60.00864
t.test(data_eng$IGS1 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$IGS1 by data_eng$depreposparto
## t = -0.53474, df = 42, p-value = 0.5956
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -18.76328 10.90255
## sample estimates:
## mean in group 1 mean in group 2
## 53.66349 57.59386
t.test(data_eng$PDQ2 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$PDQ2 by data_eng$depreposparto
## t = -1.0373, df = 42, p-value = 0.3055
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.944320 1.587177
## sample estimates:
## mean in group 1 mean in group 2
## 12.57143 14.25000
t.test(data_eng$SOMATIZATIONS2 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$SOMATIZATIONS2 by data_eng$depreposparto
## t = -2.3414, df = 42, p-value = 0.02403
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -26.165372 -1.940608
## sample estimates:
## mean in group 1 mean in group 2
## 41.90358 55.95657
t.test(data_eng$DEPRESSION2 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$DEPRESSION2 by data_eng$depreposparto
## t = -2.6791, df = 42, p-value = 0.0105
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -30.31906 -4.26653
## sample estimates:
## mean in group 1 mean in group 2
## 30.03902 47.33181
t.test(data_eng$ANXIETY2 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$ANXIETY2 by data_eng$depreposparto
## t = -3.2262, df = 42, p-value = 0.002433
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -36.276831 -8.356862
## sample estimates:
## mean in group 1 mean in group 2
## 40.40592 62.72277
t.test(data_eng$IGS2 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$IGS2 by data_eng$depreposparto
## t = -2.3543, df = 42, p-value = 0.02331
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -31.582838 -2.428849
## sample estimates:
## mean in group 1 mean in group 2
## 41.21783 58.22367
t.test(data_eng$PDQ3 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$PDQ3 by data_eng$depreposparto
## t = -2.6784, df = 42, p-value = 0.01051
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.1371645 -0.8628355
## sample estimates:
## mean in group 1 mean in group 2
## 11.25 14.75
t.test(data_eng$SOMATIZATIONS3 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$SOMATIZATIONS3 by data_eng$depreposparto
## t = -2.0067, df = 42, p-value = 0.05125
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -29.7682529 0.0843417
## sample estimates:
## mean in group 1 mean in group 2
## 50.19037 65.03233
t.test(data_eng$DEPRESSION3 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$DEPRESSION3 by data_eng$depreposparto
## t = -1.909, df = 42, p-value = 0.06312
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -31.6042263 0.8782022
## sample estimates:
## mean in group 1 mean in group 2
## 44.23573 59.59874
t.test(data_eng$ANXIETY3 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$ANXIETY3 by data_eng$depreposparto
## t = -1.3242, df = 42, p-value = 0.1926
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -27.588431 5.727767
## sample estimates:
## mean in group 1 mean in group 2
## 55.49580 66.42613
t.test(data_eng$IGS3 ~ data_eng$depreposparto, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$IGS3 by data_eng$depreposparto
## t = -1.4125, df = 42, p-value = 0.1652
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -31.099092 5.490234
## sample estimates:
## mean in group 1 mean in group 2
## 55.63101 68.43544
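The fifteen t-tests above repeat the same call with a different column each time. A possible way to run them in one pass is sketched below, assuming a data frame shaped like data_eng. The values here are simulated and only the column names follow the report, so the printed statistics are meaningless.

```r
# Simulated stand-in for data_eng: 28 + 16 rows, two score columns.
set.seed(1)
toy <- data.frame(
  depreposparto  = factor(rep(c(1, 2), times = c(28, 16))),
  PDQ1           = rnorm(44, mean = 15, sd = 6),
  SOMATIZATIONS1 = rnorm(44, mean = 60, sd = 25)
)

vars <- c("PDQ1", "SOMATIZATIONS1")

# One pooled-variance two-sample t-test per variable, collected into one table.
results <- do.call(rbind, lapply(vars, function(v) {
  tt <- t.test(toy[[v]] ~ toy$depreposparto, var.equal = TRUE)
  data.frame(variable = v,
             t        = unname(tt$statistic),
             df       = unname(tt$parameter),
             p_value  = tt$p.value)
}))
results
```

Collecting the results in one data frame also makes it easier to compare them cell by cell against a published table.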
“Maternal hair cortisol levels and sex of the fetus.”
In this table, two-sample t-tests were used to assess whether the sex of the fetus could influence the release of cortisol during pregnancy. No significant differences were found (p > .05). We replicated Table 3 with the same method as Table 2.
Observations:
Our t-test results match the original table. However, the mean and standard deviation of the cortisol level for male fetuses in the third trimester are not correct in the original. The errors and differences are labeled in red.
#insert images into R Markdown
knitr::include_graphics("table3.png")
data_eng %>%
group_by(SexFetalDico) %>%
summarise(mean_cor1 = mean(LNCORTISOL1), sd_cor1 = sd(LNCORTISOL1), mean_cor2 = mean(LNCORTISOL2), sd_cor2 = sd(LNCORTISOL2), mean_cor3 = mean(LNCORTISOL3), sd_cor3 = sd(LNCORTISOL3))
## # A tibble: 2 x 7
## SexFetalDico mean_cor1 sd_cor1 mean_cor2 sd_cor2 mean_cor3 sd_cor3
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0 5.09 0.564 5.46 0.401 5.79 0.520
## 2 1 5.50 0.948 5.30 0.553 5.67 0.543
t.test(data_eng$LNCORTISOL1 ~ data_eng$SexFetalDico, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$LNCORTISOL1 by data_eng$SexFetalDico
## t = -1.8118, df = 42, p-value = 0.07717
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.87103245 0.04690546
## sample estimates:
## mean in group 0 mean in group 1
## 5.087497 5.499560
t.test(data_eng$LNCORTISOL2 ~ data_eng$SexFetalDico, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$LNCORTISOL2 by data_eng$SexFetalDico
## t = 1.1148, df = 42, p-value = 0.2713
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1300095 0.4509012
## sample estimates:
## mean in group 0 mean in group 1
## 5.464946 5.304500
t.test(data_eng$LNCORTISOL3 ~ data_eng$SexFetalDico, var.equal = TRUE)
##
## Two Sample t-test
##
## data: data_eng$LNCORTISOL3 by data_eng$SexFetalDico
## t = 0.69646, df = 42, p-value = 0.49
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.2163752 0.4444215
## sample estimates:
## mean in group 0 mean in group 1
## 5.787845 5.673822
“Fig 2. SCL-90-R scores throughout pregnancy in both groups.”
This figure shows the change in psychopathological symptoms across the three trimesters in the two groups. The group with postpartum depression symptoms scored higher on every SCL-90-R subscale during the first, second, and third trimesters of pregnancy.
We used “select” and “gather” to convert the data from wide to long format, and then drew interaction plots for all SCL-90-R subscales.
Observations:
We observed that two figures differ from the originals: “Interpersonal sensitivity” and “Hostility”. The other replicated figures are nearly the same as the original figures.
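The wide-to-long step behind each interaction plot can be sketched as follows. This version uses base R's `reshape()` rather than the `gather()` call used in this report, and the four rows of scores are made up for illustration; only the column names follow the dataset.

```r
# Made-up scores for four participants, two per group.
wide <- data.frame(
  X              = 1:4,
  SOMATIZATIONS1 = c(50, 60, 55, 70),
  SOMATIZATIONS2 = c(40, 45, 60, 65),
  SOMATIZATIONS3 = c(45, 55, 65, 75),
  depreposparto  = factor(c(1, 1, 2, 2))
)

# One row per participant and trimester.
long <- reshape(wide,
                direction = "long",
                varying   = c("SOMATIZATIONS1", "SOMATIZATIONS2", "SOMATIZATIONS3"),
                v.names   = "somatization",
                timevar   = "trimester",
                times     = 1:3,
                idvar     = "X")

# Group means per trimester: the points that an interaction plot connects.
means <- aggregate(somatization ~ trimester + depreposparto, data = long, FUN = mean)
means
```

The `means` table holds one value per trimester and group, which is what the `stat_summary(fun.y = mean, ...)` layers compute inside each plot.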
#insert images into R Markdown
knitr::include_graphics("figure2.png")
tidy_SOM <- data_eng %>%
select("X", "SOMATIZATIONS1", "SOMATIZATIONS2", "SOMATIZATIONS3", "depreposparto") %>%
gather(trimester, somatization, SOMATIZATIONS1:SOMATIZATIONS3, -depreposparto)
ggplot(tidy_SOM, aes(x = trimester, y = somatization, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Somatization score", colour = "Postpartum Depression")+
ggtitle('Somatization') +
theme_bw()
tidy_obs <- data_eng %>%
select("X", "OBSESSIONS.AND.COMPULSIONS1", "OBSESSIONS.AND.COMPULSIONS2", "OBSESSIONS.AND.COMPULSIONS3", "depreposparto") %>%
gather(trimester, obsession, OBSESSIONS.AND.COMPULSIONS1:OBSESSIONS.AND.COMPULSIONS3, -depreposparto)
ggplot(tidy_obs, aes(x = trimester, y = obsession, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Obsession compulsion score", colour = "Postpartum Depression")+
ggtitle('Obsession compulsion') +
theme_bw()
tidy_is <- data_eng %>%
select("X", "SENSITIVIDADINTERPERSONAL1", "INSTRUMENT.SENSITIVITY2", "INSTRUMENT.SENSITIVITY3", "depreposparto") %>%
gather(trimester, INSTRUMENT, SENSITIVIDADINTERPERSONAL1:INSTRUMENT.SENSITIVITY3, -depreposparto)
ggplot(tidy_is, aes(x = trimester, y = INSTRUMENT, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Interpersonal sensitivity", colour = "Postpartum Depression")+
ggtitle('Interpersonal sensitivity') +
theme_bw()
tidy_ad <- data_eng %>%
select("X", "DEPRESSION1", "DEPRESSION2", "DEPRESSION3", "depreposparto") %>%
gather(trimester, depression, DEPRESSION1:DEPRESSION3, -depreposparto)
ggplot(tidy_ad, aes(x = trimester, y = depression, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Antenatal depression", colour = "Postpartum Depression")+
ggtitle('Antenatal depression') +
theme_bw()
tidy_anx <- data_eng %>%
select("X", "ANXIETY1", "ANXIETY2", "ANXIETY3", "depreposparto") %>%
gather(trimester, anxitety, ANXIETY1:ANXIETY3, -depreposparto)
ggplot(tidy_anx, aes(x = trimester, y = anxitety, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Anxitety", colour = "Postpartum Depression")+
ggtitle('Anxitety') +
theme_bw() tidy_hos <- data_eng %>%
select("X", "HOSTILIDAD1", "HOSTILITY2", "HOSTILIDAD3", "depreposparto") %>%
gather(trimester, hostility, HOSTILIDAD1:HOSTILIDAD3, -depreposparto)
ggplot(tidy_hos, aes(x = trimester, y = hostility, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Hostility", colour = "Postpartum Depression")+
ggtitle('Hostility') +
theme_bw()
tidy_pa <- data_eng %>%
select("X", "ANSIEDADFOBICA1", "ANSIEDADFOBICA2", "ANSIEDADFOBICA3", "depreposparto") %>%
gather(trimester, p_anxiety, ANSIEDADFOBICA1:ANSIEDADFOBICA3, -depreposparto)
ggplot(tidy_pa, aes(x = trimester, y = p_anxiety, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Phobic anxiety", colour = "Postpartum Depression")+
ggtitle('Phobic anxiety') +
theme_bw()
tidy_pi <- data_eng %>%
select("X", "paranoid.ideation1", "paranoid.ideation2", "paranoid.ideation3", "depreposparto") %>%
gather(trimester, paranoid, paranoid.ideation1:paranoid.ideation3, -depreposparto)
ggplot(tidy_pi, aes(x = trimester, y = paranoid, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Paranoid ideation", colour = "Postpartum Depression")+
ggtitle('Paranoid ideation') +
theme_bw()
tidy_psicoticism <- data_eng %>%
select("X", "PSICOTICISMO1", "PSICOTICISMO2", "PSICOTICISMO3", "depreposparto") %>%
gather(trimester, psicoticism, PSICOTICISMO1:PSICOTICISMO3, -depreposparto)
ggplot(tidy_psicoticism, aes(x = trimester, y = psicoticism, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Psychoticism", colour = "Postpartum Depression")+
ggtitle('Psychoticism') +
theme_bw()
tidy_IGS <- data_eng %>%
select("X", "IGS1", "IGS2", "IGS3", "depreposparto") %>%
gather(trimester, IGS, IGS1:IGS3, -depreposparto)
ggplot(tidy_IGS, aes(x = trimester, y = IGS, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Global Severity Index", colour = "Postpartum Depression")+
ggtitle('Global Severity Index') +
theme_bw()
tidy_SP <- data_eng %>%
select("X", "SP1", "SP2", "SP3", "depreposparto") %>%
gather(trimester, SP, SP1:SP3, -depreposparto)
ggplot(tidy_SP, aes(x = trimester, y = SP, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Positive symptoms total", colour = "Postpartum Depression")+
ggtitle('Positive symptoms total') +
theme_bw()
tidy_PSDI <- data_eng %>%
select("X", "PSDI1", "PSDI2", "PSDI3", "depreposparto") %>%
gather(trimester, PSDI, PSDI1:PSDI3, -depreposparto)
ggplot(tidy_PSDI, aes(x = trimester, y = PSDI, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Positive symptoms distress index", colour = "Postpartum Depression")+
ggtitle('Positive symptoms distress index') +
theme_bw()
“Hair cortisol levels differences (pg/mg) in each trimester between women with and without postpartum depression symptoms.”
In this figure, hair cortisol levels increased from the first to the third trimester in the group with no postpartum depression symptoms, reaching their highest level in the third trimester. In the group with postpartum depression symptoms, by contrast, hair cortisol levels decreased from the first to the second trimester and increased in the third trimester.
We used the same method as for Figure 2. The replicated figure is similar to the original figure.
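The error bars in these trimester plots come from `mean_cl_boot`, which wraps `Hmisc::smean.cl.boot`. As a rough illustration of what such a bar represents, the base R sketch below computes a percentile bootstrap interval for a mean. The function name `boot_ci_mean` and the `scores` values are hypothetical, not taken from the study data, and the sketch is not meant to reproduce the Hmisc implementation exactly.

```r
# Percentile bootstrap confidence interval for a mean (illustrative sketch)
set.seed(123)

# Hypothetical scores for one group in one trimester (NOT study data)
scores <- c(4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.4)

boot_ci_mean <- function(x, n_boot = 1000, conf = 0.95) {
  # Resample with replacement and record the mean of each resample
  boot_means <- replicate(n_boot, mean(sample(x, replace = TRUE)))
  alpha <- 1 - conf
  c(mean  = mean(x),
    lower = unname(quantile(boot_means, alpha / 2)),
    upper = unname(quantile(boot_means, 1 - alpha / 2)))
}

ci <- boot_ci_mean(scores)
print(ci)
```

With only a handful of observations per group, as in this data set, such intervals are wide, which is worth keeping in mind when comparing the two lines in each plot.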
#insert images into R Markdown
knitr::include_graphics("figure3.png")
tidy_LNcor <- data_eng %>%
select("X", "LNCORTISOL1", "LNCORTISOL2", "LNCORTISOL3", "depreposparto") %>%
gather(trimester, LNCORTISOL, LNCORTISOL1:LNCORTISOL3, -depreposparto)
tidy_LNcor$depreposparto <- as.factor(tidy_LNcor$depreposparto)
ggplot(tidy_LNcor, aes(x = trimester, y = LNCORTISOL, colour = depreposparto)) +
stat_summary(fun.y = mean, geom = "point") +
stat_summary(fun.y = mean, geom = "line", size = 1, aes(group= factor(depreposparto))) +
stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.1, size = 0.5) +
labs(x = "Trimesters", y = "Hair cortisol levels differences", colour = "Postpartum Depression")+
theme_bw()
In addition, a linear regression was carried out to test whether the mothers’ hair cortisol levels could predict postpartum depression symptoms. Results of the regression revealed that hair cortisol levels could predict 21.7% of the variance of postpartum depression symptoms [R² = .21, (F = 3.703, p < .05)]. More precisely, hair cortisol at the first trimester (p < .05) and the third trimester (p < .05) significantly predicted the EPDS scores.
We built a parallel linear regression model that uses the cortisol levels from all three trimesters to predict EPDS scores. Our results are similar to those of the original paper (R² = 0.2177, F = 3.711, p = 0.019).
lm_ppd <- lm(epds ~ cortisol_tri1 + cortisol_tri2 + cortisol_tri3, data = pp_sad)
lm_ppd
##
## Call:
## lm(formula = epds ~ cortisol_tri1 + cortisol_tri2 + cortisol_tri3,
##     data = pp_sad)
##
## Coefficients:
##   (Intercept)  cortisol_tri1  cortisol_tri2  cortisol_tri3
##      4.189162       0.005903      -0.002903       0.008174
summary(lm_ppd)
##
## Call:
## lm(formula = epds ~ cortisol_tri1 + cortisol_tri2 + cortisol_tri3,
##     data = pp_sad)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -5.532 -3.231 -0.614  2.049 17.145
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)    4.189162   1.880312   2.228   0.0316 *
## cortisol_tri1  0.005903   0.002720   2.170   0.0360 *
## cortisol_tri2 -0.002903   0.006415  -0.453   0.6533
## cortisol_tri3  0.008174   0.003613   2.262   0.0292 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.61 on 40 degrees of freedom
## Multiple R-squared: 0.2177, Adjusted R-squared: 0.1591
## F-statistic: 3.711 on 3 and 40 DF, p-value: 0.01907
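As a sanity check on this output, the adjusted R² and the F-statistic can be recovered by hand from the multiple R². The worked equations below assume n = 44 observations and p = 3 predictors, which is what the 40 residual degrees of freedom imply; small differences in the last digit come from using the rounded R² of 0.2177.

\[ \bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} = 1 - (1 - 0.2177)\,\frac{43}{40} \approx 0.159 \]

\[ F = \frac{R^2 / p}{(1 - R^2)/(n - p - 1)} = \frac{0.2177 / 3}{0.7823 / 40} \approx 3.71 \]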
This extension has two parts. First, we look at how a multiple linear regression (MLR) performs when it includes all of the variables that the paper reported as statistically significant. We do this to highlight that the original findings were neither externally validated nor adjusted for multiple comparisons, so they should not be expected to perform well in a combined model. As we expected, collinearity between the variables leaves only one variable (second-trimester depression) significant in the MLR. Interestingly, the model still performs fairly well when applied to a test set (RMSE 3.87, ROC AUC 96.3%), although with only 12 test observations the dataset is far too small to validate this.
Second, we extracted a new feature that combines all of the cortisol data: the area under the curve (AUC) of hair cortisol across the three trimesters. We then fitted a probit regression on this feature to predict postpartum depression. This model performed reasonably well and could be a useful tool for predicting postpartum depression, although its sensitivity was rather low (0.56, against a specificity of 0.93).
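One common way to quantify the collinearity mentioned above is the variance inflation factor (VIF). The base R sketch below computes it by hand on simulated predictors. The data frame `X` and its columns `dep1`, `dep2` and `cort` are hypothetical stand-ins, not the study variables, so the printed values say nothing about the real data.

```r
# Variance inflation factors computed by hand (illustrative sketch)
set.seed(1)
n <- 44

# Simulated stand-in predictors (NOT the study data)
dep1 <- rnorm(n)
dep2 <- dep1 + rnorm(n, sd = 0.3)  # built to be strongly correlated with dep1
cort <- rnorm(n)                   # generated independently of the other two
X <- data.frame(dep1, dep2, cort)

# VIF_j = 1 / (1 - R^2_j), where R^2_j is obtained by regressing
# predictor j on all of the other predictors
vif_manual <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})

print(round(vif_manual, 2))
```

A VIF close to 1 indicates little overlap with the other predictors, while large values indicate that a coefficient's standard error is inflated by collinearity. Applying the same calculation to the predictors in `lm_ppd2` would be one way to check the claim for the real data.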
lm_ppd2 <- lm(epds ~ cortisol_tri1 + cortisol_tri3 + depression_tri2 + depression_tri1 + depression_tri3 + employed + first_pregnancy + fetus_sex, data = pp_sad)
lm_ppd2
##
## Call:
## lm(formula = epds ~ cortisol_tri1 + cortisol_tri3 + depression_tri2 +
##     depression_tri1 + depression_tri3 + employed + first_pregnancy +
##     fetus_sex, data = pp_sad)
##
## Coefficients:
##     (Intercept)    cortisol_tri1    cortisol_tri3  depression_tri2
##        2.923904         0.001719         0.005820         0.114831
## depression_tri1  depression_tri3         employed  first_pregnancy
##       -0.081802         0.044738        -1.028279        -0.451426
##       fetus_sex
##        1.868121
summary(lm_ppd2)
##
## Call:
## lm(formula = epds ~ cortisol_tri1 + cortisol_tri3 + depression_tri2 +
##     depression_tri1 + depression_tri3 + employed + first_pregnancy +
##     fetus_sex, data = pp_sad)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -5.958 -2.594  0.127  1.670 14.434
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)      2.923904   2.668826   1.096  0.28075
## cortisol_tri1    0.001719   0.002838   0.606  0.54849
## cortisol_tri3    0.005820   0.003385   1.719  0.09445 .
## depression_tri2  0.114831   0.036149   3.177  0.00311 **
## depression_tri1 -0.081802   0.042912  -1.906  0.06485 .
## depression_tri3  0.044738   0.032346   1.383  0.17539
## employed        -1.028279   1.534480  -0.670  0.50718
## first_pregnancy -0.451426   1.388943  -0.325  0.74711
## fetus_sex        1.868121   1.394651   1.339  0.18904
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.031 on 35 degrees of freedom
## Multiple R-squared: 0.4767, Adjusted R-squared: 0.3571
## F-statistic: 3.985 on 8 and 35 DF, p-value: 0.001917
#Performance
##### Train Test Split
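library(caTools)
set.seed(101)
# Caution: sample.split() expects a vector of labels (e.g. pp_sad$epds). Given a whole data frame
# it returns one flag per column, which subset() then recycles across rows, so this split is not
# a random 75% of the participants. The call is kept as run so that the output below still matches.
sample <- sample.split(pp_sad, SplitRatio = 0.75)
train <- subset(pp_sad, sample == TRUE)
test <- subset(pp_sad, sample == FALSE)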
lm_fit <- lm(epds ~ cortisol_tri1 + cortisol_tri3 + depression_tri2 + depression_tri1 + depression_tri3 + employed + first_pregnancy + fetus_sex, data = train)
summary(lm_fit)
##
## Call:
## lm(formula = epds ~ cortisol_tri1 + cortisol_tri3 + depression_tri2 +
##     depression_tri1 + depression_tri3 + employed + first_pregnancy +
##     fetus_sex, data = train)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -5.9086 -2.5061 -0.2333  1.9720 14.1160
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)      6.523787   4.048402   1.611  0.12072
## cortisol_tri1    0.001589   0.003320   0.479  0.63674
## cortisol_tri3    0.004109   0.004200   0.978  0.33813
## depression_tri2  0.123018   0.043721   2.814  0.00985 **
## depression_tri1 -0.090673   0.052098  -1.740  0.09515 .
## depression_tri3  0.052860   0.042032   1.258  0.22115
## employed        -3.365812   2.480569  -1.357  0.18799
## first_pregnancy -1.706653   1.930058  -0.884  0.38571
## fetus_sex        1.588408   1.766209   0.899  0.37780
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.436 on 23 degrees of freedom
## Multiple R-squared: 0.5063, Adjusted R-squared: 0.3345
## F-statistic: 2.948 on 8 and 23 DF, p-value: 0.02008
## RMSE
# Predict on the held-out test set
predicted <- predict(lm_fit, test)
# Calculate RMSE
actual <- test$epds
sqrt(mean((predicted - actual)^2))
## [1] 3.868171
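Because a single held-out set this small gives a noisy RMSE, leave-one-out cross-validation is one alternative that uses every observation for testing exactly once. The sketch below illustrates the idea in base R on simulated data. The data frame `sim` and its columns `x1`, `x2` and `y` are hypothetical and unrelated to `pp_sad`, so the resulting RMSE is not comparable to the value above.

```r
# Leave-one-out cross-validated RMSE for a linear model (illustrative sketch)
set.seed(42)
n <- 44

# Simulated stand-in data (NOT the study data)
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 5 + 2 * sim$x1 + rnorm(n)

# For each row i: fit on all other rows, then predict the held-out row
loo_pred <- sapply(seq_len(n), function(i) {
  fit <- lm(y ~ x1 + x2, data = sim[-i, ])
  unname(predict(fit, newdata = sim[i, , drop = FALSE]))
})

loo_rmse <- sqrt(mean((loo_pred - sim$y)^2))
print(loo_rmse)
```

The same loop could be applied to `pp_sad` with the formula used for `lm_fit`, which would avoid relying on one particular 12-row test set.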
##predict in test model
test$lm_predict <- predict(lm_fit, test)
test$obs_PPD <- ifelse(test$epds < 10, "OB_No_PPD", "OB_PPD")
test$predict_PPD <- ifelse(test$lm_predict < 10, "No PPD", "PPD")
# Table
table(test$predict_PPD, test$obs_PPD)
##
##          OB_No_PPD OB_PPD
##   No PPD         7      0
##   PPD            2      3
library(pROC)
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
##
##     cov, smooth, var
# ROC curve
roc_PPD <- roc(obs_PPD ~ lm_predict, data = test, percent = T)
print(roc_PPD)
##
## Call:
## roc.formula(formula = obs_PPD ~ lm_predict, data = test, percent = T)
##
## Data: lm_predict in 9 controls (obs_PPD OB_No_PPD) < 3 cases (obs_PPD OB_PPD).
## Area under the curve: 96.3%
# Draw ROC curve
plot.roc(roc_PPD, print.auc=TRUE, col="blue")
\[ AUC = \int_{0}^{6} (\beta x + C) \, dx = 18\beta + 6C \]
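For three measurements per participant, the area under the cortisol curve can also be obtained directly from the data points with the trapezoid rule rather than from a fitted line. The base R sketch below shows the arithmetic. The helper name `trapezoid_auc` and the cortisol values are hypothetical, and the month positions 2, 5 and 8 are an assumed spacing of the trimester samples.

```r
# Trapezoid rule: sum over intervals of (width) x (mean of the two endpoint heights)
trapezoid_auc <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Hypothetical hair cortisol values (pg/mg) at months 2, 5 and 8 (NOT study data)
months <- c(2, 5, 8)
cortisol <- c(100, 200, 300)
auc_example <- trapezoid_auc(months, cortisol)
print(auc_example)

# For a straight line y = slope * x + intercept on [0, 6] the rule is exact
# and agrees with the closed form 18 * slope + 6 * intercept
slope <- 2
intercept <- 5
line_x <- c(0, 3, 6)
auc_line <- trapezoid_auc(line_x, slope * line_x + intercept)
print(auc_line)
```

The second call ties the numerical rule back to the integral above: for a linear cortisol trajectory the two approaches give the same area, and they differ only when the three points do not lie on a line.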
load('../../data/tidy_data.Rdata')
pp_sad <- df_tidy %>% mutate(postpartum_depression = postpartum_depression - 1)
library(DescTools) # assumed source of AUC(); the original output does not show which package provides it
cortisol_auc <- vector(mode = 'numeric', length = nrow(pp_sad)) # named so it does not shadow AUC()
for (i in seq_len(nrow(pp_sad))) {
  patient <- pp_sad %>% filter(row_number() == i)
  y <- c(patient$cortisol_tri1, patient$cortisol_tri2, patient$cortisol_tri3)
  x <- c(2, 5, 8) #months, what are the actual time frames we should use?
  cortisol_auc[i] <- AUC(x, y, method = "trapezoid", na.rm = FALSE) # this uses interpolation between our data points instead of lm fit
}
pp_sad <- pp_sad %>% mutate(Cortisol_AUC = cortisol_auc)
# Probit fit of postpartum depression against the cortisol AUC
pp_sad %>% ggplot(aes(x=Cortisol_AUC, y=postpartum_depression)) + geom_point() + geom_smooth(method='glm', method.args=list(family=binomial(link="probit")))
# Density of the cortisol AUC by group; alpha is a fixed setting, so it belongs outside aes()
pp_sad %>% ggplot(aes(x=Cortisol_AUC, group=postpartum_depression, fill=as.factor(postpartum_depression))) + geom_density(alpha=0.25)
pp_probit <- glm('postpartum_depression ~ Cortisol_AUC', family = binomial(link = 'probit'), data = pp_sad)
t.tst <- t.test(Cortisol_AUC ~ as.factor(postpartum_depression), data=pp_sad, na.rm=TRUE)
t.tst
##
##  Welch Two Sample t-test
##
## data:  Cortisol_AUC by as.factor(postpartum_depression)
## t = -3.342, df = 21.786, p-value = 0.00298
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1276.792  -298.624
## sample estimates:
## mean in group 0 mean in group 1
##        1381.193        2168.901
summary(pp_probit)
##
## Call:
## glm(formula = "postpartum_depression ~ Cortisol_AUC", family = binomial(link = "probit"),
##     data = pp_sad)
##
## Deviance Residuals:
##     Min       1Q   Median       3Q      Max
## -1.8630  -0.7378  -0.5654   0.9088   2.2126
##
## Coefficients:
##                Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -2.0616392  0.5998198  -3.437 0.000588 ***
## Cortisol_AUC  0.0010041  0.0003301   3.042 0.002350 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 57.682 on 43 degrees of freedom
## Residual deviance: 45.732 on 42 degrees of freedom
## AIC: 49.732
##
## Number of Fisher Scoring iterations: 4
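# Predicted class at a 0.5 probability cut-off. type = "response" applies the probit inverse link;
# plogis() is the logit inverse link, although both equal 0.5 at a linear predictor of 0, so the classes are unchanged.
pp_sad <- pp_sad %>% mutate(pred = ifelse(predict(pp_probit, pp_sad, type = "response") > 0.5, 1, 0))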
TP <- pp_sad %>% filter(pred == 1 & postpartum_depression == 1.0) %>% nrow()
FP <- pp_sad %>% filter(pred == 1 & postpartum_depression == 0.0) %>% nrow()
TN <- pp_sad %>% filter(pred == 0 & postpartum_depression == 0.0) %>% nrow()
FN <- pp_sad %>% filter(pred == 0 & postpartum_depression == 1.0) %>% nrow()
sensitivity <- TP / (TP+FN)
specificity <- TN / (TN + FP)
truth.tab <- data.frame(TP,TN,FP,FN, specificity, sensitivity)
truth.tab
##   TP TN FP FN specificity sensitivity
## 1  9 26  2  7   0.9285714      0.5625
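Written out with the counts from this table, the two rates are as follows. Note that the predictions were made on the same 44 participants used to fit the probit model, so these are in-sample figures.

\[ \text{sensitivity} = \frac{TP}{TP + FN} = \frac{9}{9 + 7} = 0.5625, \qquad \text{specificity} = \frac{TN}{TN + FP} = \frac{26}{26 + 2} \approx 0.929 \]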
This extension shows the value of the AUC as a feature extraction method: it resulted in a statistically significant predictor of postpartum depression (p = 0.002 in the probit model).
[1] Caparros-Gonzalez, Rafael A., et al. “Hair Cortisol Levels, Psychological Stress and Psychopathological Symptoms as Predictors of Postpartum Depression.” PLOS ONE, Public Library of Science, journals.plos.org/plosone/article/metrics?id=10.1371/journal.pone.0182817.